Volyn Oblast
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Asia > China > Guangdong Province > Guangzhou (0.04)
- South America > Colombia > Meta Department > Villavicencio (0.04)
- Overview (1.00)
- Research Report > New Finding (0.46)
- Research Report > Experimental Study (0.46)
- Law (1.00)
- Information Technology (1.00)
- Health & Medicine > Therapeutic Area (1.00)
LL-ViT: Edge Deployable Vision Transformers with Look Up Table Neurons
Nag, Shashank, Bacellar, Alan T. L., Susskind, Zachary, Jha, Anshul, Liberty, Logan, Sivakumar, Aishwarya, John, Eugene B., Kailas, Krishnan, Lima, Priscila M. V., Yadwadkar, Neeraja J., Franca, Felipe M. G., John, Lizy K.
Vision Transformers have been tremendously successful in computer vision tasks. However, their large computational, memory, and energy demands are a challenge for edge inference on FPGAs -- a field that has seen a recent surge in demand. We recognize the benefits of recent works on logic and Look Up Table (LUT) based networks, such as LogicNets, NeuraLUT, DWN, among others, in offering models that simultaneously reduce both the memory and compute footprints. However, these models natively do not perform well on common vision tasks, such as CIFAR-10/100. In this work, we propose LL-ViT, a novel edge optimized vision transformer design that integrates layers of LUT neurons within the transformer architecture. Based on our characterization that reveals that a majority of model weights and computations are from the channel mixer (MLP layer), we design an alternate LUT-based channel mixer, and simultaneously develop an FPGA-based accelerator for LL-ViT. Contrary to some attempts to replace each multiplication with a table lookup, our architecture utilizes a neural learning approach which natively learns the LUT functions. This approach allows for reduced model sizes, and a computational and energy-efficient inference solution for vision transformer models. Evaluating on edge-suitable workloads, we achieve accuracies of 95.5% on CIFAR-10, 78.8% on CIFAR-100, and 60.9% on Tiny-ImageNet datasets, comparable to the baseline transformer. LL-ViT eliminates over 60% of the model weights and 50% of the multiplications in the model, and achieves 1.9x energy efficiency and 1.3x lower latency over an integer quantized ViT accelerator, while also offering superior throughput against prior works at a 10.9W power budget.
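The idea of a LUT neuron mentioned in the abstract can be illustrated with a minimal sketch. This is not the LL-ViT implementation: the class name `LUTNeuron`, the helper `lut_index`, and the hand-written table are hypothetical, and the learning step that the paper says determines the table contents is omitted. The sketch only shows inference, where a neuron with k binary inputs packs them into an address and reads its output from a table of 2^k entries, with no multiplications.

```python
from typing import List, Sequence


def lut_index(bits: Sequence[int]) -> int:
    """Pack binary inputs into a table address (first bit is most significant)."""
    index = 0
    for b in bits:
        index = (index << 1) | (1 if b else 0)
    return index


class LUTNeuron:
    """Hypothetical k-input lookup-table neuron: output = table[address]."""

    def __init__(self, table: Sequence[int]) -> None:
        size = len(table)
        k = (size - 1).bit_length() if size > 0 else 0
        # A k-input neuron needs exactly 2**k table entries, with k >= 1.
        if size < 2 or size != (1 << k):
            raise ValueError("table length must be a power of two, at least 2")
        self.k = k
        self.table: List[int] = list(table)

    def __call__(self, bits: Sequence[int]) -> int:
        if len(bits) != self.k:
            raise ValueError(f"expected {self.k} input bits, got {len(bits)}")
        # Inference is a single memory read.
        return self.table[lut_index(bits)]


# Assumption for illustration: a table fixed by hand to implement XOR.
# In a trained model the entries would be learned, not written out like this.
xor_neuron = LUTNeuron([0, 1, 1, 0])
outputs = [xor_neuron([a, b]) for a in (0, 1) for b in (0, 1)]
print(outputs)
```

A single table of this kind can represent any Boolean function of its k inputs, which is why such neurons map naturally onto FPGA LUT resources.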
- North America > Canada > Ontario > Toronto (0.14)
- Europe > Ukraine > Volyn Oblast > Luts'k (0.04)
- South America > Brazil > Rio de Janeiro > Rio de Janeiro (0.04)
- Information Technology (0.93)
- Energy (0.66)
Sub-microsecond Transformers for Jet Tagging on FPGAs
Laatu, Lauri, Sun, Chang, Cox, Arianna, Gandrakota, Abhijith, Maier, Benedikt, Ngadiuba, Jennifer, Que, Zhiqiang, Luk, Wayne, Spiropulu, Maria, Tapper, Alexander
We present the first sub-microsecond transformer implementation on an FPGA achieving competitive performance for state-of-the-art high-energy physics benchmarks. Transformers have shown exceptional performance on multiple tasks in modern machine learning applications, including jet tagging at the CERN Large Hadron Collider (LHC). However, until now their computational complexity has prohibited their use in real-time applications, such as the hardware trigger systems of the collider experiments. In this work, we demonstrate the first application of transformers for jet tagging on FPGAs, achieving $\mathcal{O}(100)$ nanosecond latency with superior performance compared to alternative baseline models. We leverage high-granularity quantization and distributed arithmetic optimization to fit the entire transformer model on a single FPGA, achieving the required throughput and latency. Furthermore, we add multi-head attention and linear attention support to hls4ml, making our work accessible to the broader fast machine learning community. This work advances the next-generation trigger systems for the High Luminosity LHC, enabling the use of transformers for real-time applications in high-energy physics and beyond.
- North America > United States > California (0.04)
- Europe > United Kingdom > England > Greater London > London (0.04)
- Europe > Ukraine > Volyn Oblast > Luts'k (0.04)
- Europe > Slovenia > Drava > Municipality of Benedikt > Benedikt (0.04)
Bhasha-Rupantarika: Algorithm-Hardware Co-design approach for Multilingual Neural Machine Translation
Lokhande, Mukul, Dewangan, Tanushree, Mansoori, Mohd Sharik, Chaudhari, Tejas, J., Akarsh, Lokhande, Damayanti, Teman, Adam, Vishvakarma, Santosh Kumar
This paper introduces Bhasha-Rupantarika, a light and efficient multilingual translation system tailored through algorithm-hardware codesign for resource-limited settings. The method investigates model deployment at sub-octet precision levels (FP8, INT8, INT4, and FP4), with experimental results indicating a 4.1x reduction in model size (FP4) and a 4.2x inference speedup, which corresponds to an increased throughput of 66 tokens/s (a 4.8x improvement). This underscores the importance of ultra-low precision quantization for real-time deployment in IoT devices using FPGA accelerators, achieving performance on par with expectations. Our evaluation covers bidirectional translation between Indian and international languages, showcasing its adaptability in low-resource linguistic contexts. The FPGA deployment demonstrated a 1.96x reduction in LUTs and a 1.65x decrease in FFs, resulting in a 2.2x enhancement in throughput compared to OPU and a 4.6x enhancement compared to HPTA. Overall, the evaluation provides a viable solution based on quantisation-aware translation along with hardware efficiency suitable for deployable multilingual AI systems. The complete code [https://github.com/mukullokhande99/Bhasha-Rupantarika/] and dataset are publicly available for reproducibility, facilitating rapid integration and further development by researchers.
- Asia > India (0.15)
- North America > United States (0.04)
- Europe > Ukraine > Volyn Oblast > Luts'k (0.04)
- Asia > Middle East > Israel (0.04)
- Information Technology (0.47)
- Government (0.46)
Russian drone and missile attack on Ukraine kills one, wounds 15
At least one person has been killed and 18 others wounded in a Russian drone and missile attack on Ukraine, officials said, as Moscow launched its largest attack on its neighbour in weeks amid an ongoing diplomatic push for a ceasefire. Russian forces launched 574 drones and 40 missiles overnight, Ukraine's Air Force said on Thursday, adding that its air defence units had downed most of them. But a number of the drones and missiles struck targets in several locations across Ukraine, resulting in casualties and damage to buildings. In the western city of Lviv, about 70km (43 miles) from the border with Poland, a drone and missile attack killed one person, injured three and damaged 26 residential buildings, Governor Maksym Kozytskyi said. In Mukachevo, near the border with Hungary and Slovakia, 15 people were wounded in Russian attacks, local authorities said.
- Asia > Russia (0.99)
- Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.28)
- Europe > Hungary (0.27)
- Government > Regional Government > North America Government > United States Government (1.00)
- Government > Military (1.00)
- Government > Regional Government > Europe Government > Russia Government (0.70)
- Government > Regional Government > Asia Government > Russia Government (0.70)
Generating Multi-Table Time Series EHR from Latent Space with Minimal Preprocessing
Cho, Eunbyeol, Kim, Jiyoun, Lee, Minjae, Park, Sungjin, Choi, Edward
Electronic Health Records (EHR) are time-series relational databases that record patient interactions and medical events over time, serving as a critical resource for healthcare research and applications. However, privacy concerns and regulatory restrictions limit the sharing and utilization of such sensitive data, necessitating the generation of synthetic EHR datasets. Unlike previous EHR synthesis methods, which typically generate medical records consisting of expert-chosen features (e.g. a few vital signs or structured codes only), we introduce RawMed, the first framework to synthesize multi-table, time-series EHR data that closely resembles raw EHRs. Using text-based representation and compression techniques, RawMed captures complex structures and temporal dynamics with minimal preprocessing. We also propose a new evaluation framework for multi-table time-series synthetic EHRs, assessing distributional similarity, inter-table relationships, temporal dynamics, and privacy. Validated on two open-source EHR datasets, RawMed outperforms baseline models in fidelity and utility. The code is available at https://github.com/eunbyeol-cho/RawMed.
- Information Technology > Databases (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)
- Information Technology > Data Science > Data Mining (0.88)
NATO jets scrambled amid Russia's largest drone attack on Ukraine
President Donald Trump says the U.S. will have to send more weapons to Ukraine, just days after the Pentagon paused critical weapons deliveries to Kyiv. NATO jets were scrambled overnight as Russia carried out its largest drone attack yet on Ukraine, launching more than 700 drones, officials said. Ukrainian President Volodymyr Zelenskyy said the "new massive Russian attack on our cities" involved "728 drones of various types, including over 300 Shaheds, and 13 missiles – Kinzhals and Iskanders." "Most of the targets were shot down. Our interceptor drones were used -- dozens of enemy targets were downed, and we are scaling up this technology."
- Asia > Russia (1.00)
- North America > United States (0.37)
- Europe > Ukraine > Kyiv Oblast > Kyiv (0.29)
- Government > Military (1.00)
- Government > Regional Government > Europe Government > Russia Government (0.39)
- Government > Regional Government > Asia Government > Russia Government (0.39)
- Government > Regional Government > North America Government > United States Government (0.37)
DPUV4E: High-Throughput DPU Architecture Design for CNN on Versal ACAP
Li, Guoyu, Zheng, Pengbo, Weng, Jian, Yang, Enshan
Convolutional Neural Networks (CNNs) remain prevalent in computer vision applications, and FPGAs, known for their flexibility and energy efficiency, have become essential components in heterogeneous acceleration systems. AMD's Versal ACAP architecture, tailored for AI applications, incorporates AI Engines (AIEs) to deliver high computational power. Nevertheless, the platform suffers from insufficient memory bandwidth, hindering the full utilization of the AIEs' theoretical performance. We design two computation units, Conv PE and DWC PE, to support different computational patterns. Each computation unit's data flow efficiently utilizes the data reuse opportunities to mitigate bandwidth bottlenecks. Additionally, we extend the functionality of each PE to utilize AIEs for non-convolutional operations, reducing resource overhead. Experiments on over 50 models show that compared to previous designs, our design provides 8 . At present, deep learning (DL) has profoundly integrated into our daily lives. Despite the emergence of new transformer-based neural networks, Convolutional Neural Networks (CNN) remain extensively employed owing to their proficiency in extracting local information from images in relatively smaller datasets. GPUs' efficient parallel processing is used to improve CNN inference, but their general-purpose design reduces energy efficiency. To improve accelerators' energy efficiency and throughput, custom CNN architectures have been proposed.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > California > San Diego County > San Diego (0.04)
- North America > United States > Utah > Salt Lake County > Salt Lake City (0.04)
At least 3 killed in Russia's 'most powerful' attack on Ukraine's Kharkiv
At least five people have been killed and more than 20 wounded as Russia launched a barrage of missiles, drones and bombs across Ukraine, officials said. The Ukrainian air force said on Saturday that Russia struck with 215 missiles and drones overnight, and Ukrainian air defences shot down and neutralised 87 drones and seven missiles. At least three people were killed and 17 others, including two children, were wounded in the northeastern city of Kharkiv, Mayor Ihor Terekhov said, describing the assault as "the most powerful" on the city since Russia launched its full-scale invasion of Ukraine in 2022. He reported 48 Iranian-made drones, two missiles and four guided bombs were fired before dawn at the city of 1.4 million people, located just 50km (30 miles) from the Russian border. "Drones are still circling above," Terekhov wrote on Telegram at 4:40am (01:40 GMT), as air raid sirens wailed across the city. Residential buildings and civilian infrastructure were heavily damaged.
- Asia > Russia (1.00)
- Europe > Ukraine > Kharkiv Oblast > Kharkiv (0.62)
- North America > United States (0.34)
- Government > Military > Air Force (0.74)
- Government > Regional Government > Europe Government > Russia Government (0.34)
- Government > Regional Government > Asia Government > Russia Government (0.34)